Counterexamples for Expected Rewards
Authors
Abstract
The computation of counterexamples for probabilistic systems has gained a lot of attention during the last few years. All of the proposed methods focus on the situation where the probabilities of certain events are too high. In this paper we investigate how counterexamples can be computed for properties concerning expected costs (or, equivalently, expected rewards) of events. We propose methods to extract a minimal subsystem which already leads to costs beyond the allowed bound. Besides these exact methods, we present heuristic approaches based on path search and on best-first search, which are applicable to very large systems where deriving a minimal subsystem becomes infeasible due to the system size. Experiments show that we can compute counterexamples for systems with millions of states.
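As a rough illustration of the heuristic idea, the following Python sketch runs a best-first search over the most probable paths of a toy DTMC with transition costs, collecting paths until their accumulated expected cost already exceeds the given bound; the visited paths then form a candidate critical subsystem. All state names, probabilities, and costs below are invented for illustration, and this is a sketch of the general technique, not the authors' exact algorithm.

```python
import heapq

# Hypothetical toy DTMC: state -> list of (successor, probability).
# Costs are attached to transitions. All values are illustrative only.
trans = {
    "s0": [("s1", 0.6), ("s2", 0.4)],
    "s1": [("goal", 1.0)],
    "s2": [("s1", 0.5), ("goal", 0.5)],
    "goal": [],
}
cost = {("s0", "s1"): 2.0, ("s0", "s2"): 1.0,
        ("s1", "goal"): 3.0, ("s2", "s1"): 1.0, ("s2", "goal"): 4.0}

def best_first_counterexample(bound, init="s0", target="goal"):
    """Collect most-probable target paths until their expected-cost
    mass alone exceeds `bound`; the union of their states is a
    candidate critical subsystem (counterexample)."""
    # max-heap on path probability (negated, since heapq is a min-heap)
    heap = [(-1.0, 0.0, (init,))]
    expected, paths = 0.0, []
    while heap and expected <= bound:
        negp, c, path = heapq.heappop(heap)
        s = path[-1]
        if s == target:
            expected += (-negp) * c  # contribution prob * path cost
            paths.append(path)
            continue
        for t, p in trans[s]:
            heapq.heappush(heap, (negp * p, c + cost[(s, t)], path + (t,)))
    return expected, paths

exp_cost, witness = best_first_counterexample(bound=2.5)
```

Here the single most probable path already contributes an expected cost above the bound, so the search terminates after one path; in larger models many paths would be accumulated before the bound is crossed.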
Similar Resources
COUNTEREXAMPLES IN CHAOTIC GENERALIZED SHIFTS
In the following text, for an arbitrary $X$ with at least two elements and a nonempty countable set $\Gamma$, we make a comparative study of the collection of generalized shift dynamical systems of the form $(X^\Gamma, \sigma_\varphi)$, where $\varphi:\Gamma\to\Gamma$ is an arbitrary self-map. We pay attention to sub-systems and combinations of generalized shifts, with counterexamples regarding Devaney, exact Dev...
Stochastic Bounded Model Checking: Bounded Rewards and Compositionality
We extend the available SAT/SMT-based methods for generating counterexamples of probabilistic systems in two ways: First, we propose bounded rewards, which are appropriate, e.g., to model the energy consumption of autonomous embedded systems, and show how to extend the SMT-based counterexample generation to handle such models. Second, we describe a compositional SAT encoding of the transition ...
Total Expected Discounted Reward MDPs: Existence of Optimal Policies
This article describes results on the existence of optimal and nearly optimal policies for Markov Decision Processes (MDPs) with total expected discounted rewards. The problem of optimizing total expected discounted rewards for MDPs is also known as discounted dynamic programming.
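The total expected discounted reward criterion can be illustrated with a standard value-iteration sketch; the two-state MDP below is a toy model invented for illustration and is not taken from the article.

```python
# Minimal value iteration for total expected discounted rewards.
P = {  # (state, action) -> list of (next_state, probability)
    (0, "a"): [(0, 0.5), (1, 0.5)],
    (0, "b"): [(1, 1.0)],
    (1, "a"): [(1, 1.0)],
}
R = {(0, "a"): 1.0, (0, "b"): 0.0, (1, "a"): 2.0}  # immediate rewards
gamma = 0.9  # discount factor

V = {0: 0.0, 1: 0.0}
for _ in range(1000):  # the Bellman operator is a contraction, so this converges
    V = {s: max(R[s, a] + gamma * sum(p * V[t] for t, p in P[s, a])
                for a in ("a", "b") if (s, a) in P)
         for s in V}
# The policy that is greedy with respect to V is optimal for this criterion.
```

Existence results such as those in the article guarantee that the greedy policy extracted from the fixed point of this iteration is indeed optimal.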
Size and probability of rewards modulate the feedback error-related negativity associated with wins but not losses in a monetarily rewarded gambling task
Feedback error-related negativity (fERN) refers to a negative deflection in the event-related potential (ERP) that distinguishes between wins and losses in terms of expected and unexpected outcomes. Some studies interpret the "expected outcome" as the probability of winning vs. losing, and others as the expected size of rewards. We still do not know much about whether these alternative ...
On an Index Policy for Restless Bandits
We investigate the optimal allocation of effort to a collection of n projects. The projects are 'restless' in that the state of a project evolves in time, whether or not it is allocated effort. The evolution of the state of each project follows a Markov rule, but transitions and rewards depend on whether or not the project receives effort. The objective is to maximize the expected time-average ...